The present invention relates to apparatus and methods for encoding and
decoding information in multimedia signals, such as audio, video or data signals.

Watermarking of multimedia signals is a technique for the transmission of
additional data along with the multimedia signal. For instance, watermarking techniques can
be used to embed copyright and copy control information into audio signals.

The main requirement of a watermarking scheme is that it is not observable
(i.e. in the case of an audio signal, it is inaudible) whilst being robust to attacks to remove the
watermark from the signal (e.g. removing the watermark will damage the signal). It will be
appreciated that the robustness of a watermark will normally be a trade off against the quality
of the signal in which the watermark is embedded. For instance, if a watermark is strongly
embedded into an audio signal (and is thus difficult to remove) then it is likely that the
quality of the audio signal will be reduced.

Various types of audio watermarking schemes have been proposed, each with
its own advantages and disadvantages. For instance, one type of audio watermarking scheme
is to use temporal correlation techniques to embed the desired data (e.g. copyright
information) into the audio signal. This technique is effectively an echo-hiding algorithm, in
which the strength of the echo is determined by solving a quadratic equation. The quadratic
equation is generated by auto-correlation values at two positions: one at delay equal to τ, and
one at delay equal to 0. In such a scheme, as echoes of the audio signal are added to the
original audio signal, the resulting signal is in fact both an amplitude and a phase modulated
version of the original audio signal. At the detector, the watermark is extracted by
determining the ratio of the auto-correlation function at the two delay positions.

This correlation technique has a number of drawbacks. For instance, it is only
possible to embed the watermark where the resulting quadratic equation has real roots, and
consequently this reduces the robustness (ability of the watermark to withstand attacks) for a
given audio quality. Further, the performance of the correlation algorithm is dependent upon
the value of the delay τ and the characteristics of the original signal. This is a significant
drawback.

Also known are watermarking schemes based on the amplitude modulation of DFT (Discrete
Fourier Transform) coefficients. As such schemes require the calculation of DFTs at both the
encoder and the decoder, the resulting hardware for implementing such DFT schemes tends
to be relatively complex, and hence the scheme tends to be slow to perform and costly.
Further, watermarks cannot be satisfactorily embedded in audio segments that have sparse
frequency characteristics, and hence the DFT scheme does not work well with particular
types of music.

WO 00/00969 describes an alternative technique for embedding or encoding
auxiliary signals (such as copyright information) into a multimedia host or cover signal. A
replica of the cover signal, or a portion of the cover signal in a particular domain (time,
frequency or space), is generated according to a stego key, which specifies modification
values to the parameters of the cover signal. The replica signal is then modified by an
auxiliary signal corresponding to the information to be embedded, and inserted back into the
cover signal so as to form the stego signal.

At the decoder, in order to extract the original auxiliary data, a replica of the
stego signal is generated in the same manner as the replica of the original cover signal, and
requires the use of the same stego key. The resulting replica is then correlated with the
received stego signal, so as to extract the auxiliary signal. The extraction of the auxiliary
signal is thus relatively complex, and requires the stego key at both the encoder (or
embedder) and decoder (or detector). Additionally, a brute force search is required to
synchronize to the auxiliary signal at the detector.

Further, performance of the payload extraction is dependent on how well the
auxiliary signal can be estimated. In a system with a high expected error rate of the payload
bits in the auxiliary signal, this is very difficult to achieve. Solutions would lead to very
complex error correction methods, or significantly limit the information capacity.

It is an object of the present invention to provide a watermarking scheme that
substantially addresses at least one of the problems of the prior art.

In a first aspect, the present invention provides a method of generating a
watermark signal for embedding in a multimedia signal, the method comprising the steps of:
(a) generating two sequences of values, the second sequence being a circularly shifted
version of the first sequence; and (b) generating a watermark signal by adding the values of
the first sequence to the respective values in the corresponding positions of the second
sequence.
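Steps (a) and (b) can be illustrated with a minimal Python sketch. The function names, the shift direction (towards higher indices) and the seed value are assumptions made for illustration only; they are not taken from the specification.

```python
import random


def circular_shift(seq, d):
    """Return a copy of seq circularly shifted by d positions (towards higher indices)."""
    seq = list(seq)
    d %= len(seq)
    return seq[-d:] + seq[:-d]


def watermark_from_shift(first, d):
    """Step (a): build the shifted second sequence; step (b): add position by position."""
    second = circular_shift(first, d)
    return [a + b for a, b in zip(first, second)]


# Hypothetical example: 8 random values in [-1, 1], shifted by 3 positions.
rng = random.Random(1)
first = [rng.uniform(-1.0, 1.0) for _ in range(8)]
watermark = watermark_from_shift(first, 3)
```

The element-wise sum keeps the length of the first sequence, so the resulting watermark can be conditioned symbol by symbol in the same way as either sequence on its own.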

Preferably, each value of the first and second sequences is represented by a
pulse of preferable width $T_s$ so as to form rectangular wave signals.

Preferably, in step (a) a window shaping function is applied to convert each of
the rectangular signals into respective smoothly varying signals, with the resulting smoothly
varying signals being added in step (b) to form the watermark signal.

Preferably, each one of said sequences of values is convolved with a window
shaping function which has a width of at least $T_s$, so as to generate two smoothly varying
signals, these smoothly varying signals being added together in step (b) so as to form the
watermark signal.

Preferably, said window shaping function has a band limited frequency
behavior and a smooth temporal behavior.

Preferably, said window shaping function has a symmetric or anti-symmetric
temporal behavior.

Preferably, said window shaping function comprises at least one of a raised
cosine function and a bi-phase function.

Preferably, the watermark signal is generated by the addition of the two
smoothly varying signals with a relative delay of $T_r$, where $T_r < T_s$.

Preferably, $T_r$ is chosen such that maximum amplitude points of the first
smoothly varying signal coincide with zero-crossings of the second smoothly varying signal,
and vice-versa.

Preferably, said watermark signal has a payload that is encoded in the 
combination of said two sequences of values. 

In another aspect, the present invention provides an apparatus arranged to
generate a watermark signal for embedding in a multimedia signal, the apparatus comprising:
(a) a sequence generator arranged to use a first sequence of values to generate a second
sequence of values, the second sequence being a circularly shifted version of the first
sequence; and (b) a signal generator arranged to generate a watermark signal by adding the
values of the first sequence to the respective values in the corresponding positions of the
second sequence.

Preferably, the apparatus further comprises a signal conditioner arranged to
convert each sequence of values into a smoothly varying signal.

Preferably, the apparatus is arranged to generate said first sequence of values
by circularly shifting a primary sequence of values.

In a further aspect, the present invention provides a method of embedding a
watermark in a multimedia signal, the method comprising the steps of: (a) generating a
watermark signal equal to the sum of two sequences of values, the second sequence being a
circularly shifted version of the first sequence of values; (b) generating a host modifying
multimedia signal as a product of the watermark signal and the multimedia signal; and (c)
generating a watermarked multimedia signal by adding a scaled version of said host
modifying multimedia signal to the multimedia signal.

Preferably, said scaled version of the host modifying signal is generated by
controlling the scaling factor by a predetermined cost function.

Preferably, said cost function comprises multiple scaling factors, each scaling
factor being defined separately for one or more of a plurality of frequency bands in the
multimedia signal.

Preferably, said frequency bands are determined according to a model of the
human auditory and/or visual system.

Preferably, in step (b) said host modifying multimedia signal is generated by
multiplying said watermark signal with an extracted portion of the multimedia signal.

Preferably, said extracted portion of the multimedia signal is obtained by
filtering at least a portion of the multimedia signal with respect to at least one of frequency,
space and time.

The method preferably further comprises the steps of: (d) generating a second
watermark signal equal to the sum of a third and a fourth sequence of values, the fourth
sequence being a circularly shifted version of the third sequence of values; (e) extracting a
second portion of the multimedia signal, the second portion being filtered such that it does
not overlap with said first portion; and (f) generating a watermarked multimedia signal by adding
the product of the second watermark signal and the second extracted portion of the
multimedia signal to the watermarked multimedia signal.

In another aspect the present invention provides an apparatus arranged to
embed a watermark signal in a multimedia signal, the apparatus comprising: (a) a watermark
generator arranged to generate a signal equal to the sum of two sequences of values, the
second sequence being a circularly shifted version of the first sequence of values; and (b) an
output signal generator arranged to generate a watermarked multimedia signal by adding the
product of the watermark signal and the multimedia signal to the multimedia signal.

Preferably, the apparatus further comprises a signal extractor arranged to
extract a first portion of the multimedia signal.

In a further aspect the present invention provides a multimedia signal
comprising a watermark, wherein the original multimedia signal has been watermarked by
modifying the temporal envelope of the original signal by the watermark, the watermark
comprising the sum of a first and a second sequence of values, the second sequence
being a circularly shifted version of the first sequence.

In another aspect the present invention provides a method of detecting a
watermark signal embedded in a multimedia signal, the method comprising the steps of: (a)
receiving a multimedia signal that may potentially be watermarked by a watermark signal
modifying the temporal envelope of the host multimedia signal; (b) extracting an estimate of
the watermark from said received signal; and (c) correlating the estimate of the watermark
with a reference version of the watermark so as to determine whether the received signal was
watermarked.

Preferably, the watermark signal has a payload, and the method further
comprises the step of determining the payload of the watermark.

In a further aspect the present invention provides a watermark detector
apparatus arranged to detect whether a watermark signal is embedded within a multimedia
signal, the watermark detector comprising: (a) a receiver arranged to receive a multimedia
signal that may potentially be watermarked by a watermark signal modifying the temporal
envelope of the host multimedia signal; (b) an extractor arranged to extract an estimate of the
watermark from said received signal; and (c) a correlator arranged to correlate the estimate of
the watermark with a reference version of the watermark so as to determine whether the
received signal was watermarked.

Preferably, the apparatus further comprises a payload detector arranged to
determine if a payload is present within said watermark and to determine the value of said
payload.

For a better understanding of the invention, and to show how embodiments of
the same may be carried into effect, reference will now be made, by way of example, to the
accompanying diagrammatic drawings in which:

Figure 1 is a diagram illustrating a watermark embedding apparatus in
accordance with an embodiment of the present invention;

Figure 2 shows a signal portion extraction filter H used in one preferred
embodiment;

Figures 3a and 3b show respectively the typical amplitude and phase responses
as a function of frequency of the filter H used in Fig. 2;

Figure 4 shows the payload embedding and watermark conditioning stage;

Figure 5 is a diagram illustrating the details of the watermark conditioning
apparatus Hc of Fig. 4, including charts of the associated signals at each stage;

Figures 6a and 6b show two preferred alternative window shaping functions
s(n) in the form of respectively a raised cosine function and a bi-phase function;

Figures 7a and 7b show respectively the frequency spectra for a watermark
sequence conditioned with a raised cosine and a bi-phase shaping window function;

Figure 8 is a diagram illustrating a watermark detector in accordance with an
embodiment of the present invention;

Figure 9 diagrammatically shows the whitening filter Hw of Fig. 8, for use in
conjunction with a raised cosine shaping window function;

Figure 10 diagrammatically shows the whitening filter Hw of Fig. 8, for use in
conjunction with a bi-phase window shaping function; and

Figure 11 shows a typical shape of the correlation function output from the
correlator of the watermark detector shown in Fig. 8.

Fig. 1 shows a block diagram of the apparatus required to perform the digital
signal processing for embedding a multi-bit payload watermark $w_c$ into a host signal x in
accordance with a preferred embodiment of the present invention.

A host signal x is provided at an input 12 of the apparatus. The host signal x is
passed in the direction of output 14 via the adder 22. However, a replica of the host signal x
(input 8) is split off in the direction of the multiplier 18, for carrying the watermark
information.

The watermark signal $w_c$ is obtained from the payload embedder and
watermark conditioning apparatus 6, and derived from the watermark random sequence $w_s$
(input 4), which is input to the payload embedder and watermark conditioning apparatus. The
multiplier 18 is utilized to calculate the product of the watermark signal $w_c$ and the replica
audio signal x. The resulting product, $w_c x$, is then passed via a gain controller 24 to the adder
22. The gain controller 24 is used to amplify or attenuate the signal by a gain factor α.

The gain factor α controls the trade off between the audibility and the
robustness of the watermark. It may be a constant, or variable in at least one of time,
frequency and space. The apparatus in Fig. 1 shows that, when α is variable, it can be
automatically adapted via a signal analyzing unit 26 based upon the properties of the host
signal x. Preferably, the gain α is automatically adapted, so as to minimize the impact on the
signal quality, according to a properly chosen perceptibility cost function, such as a psycho-
acoustic model of the human auditory system (HAS). Such a model is, for instance, described
in the paper by E. Zwicker, "Audio Engineering and Psychoacoustics: Matching signals to the
final receiver, the Human Auditory System", Journal of the Audio Engineering Society, Vol.
39, pp. 115-126, March 1991.

In the following, an audio watermark is utilized, by way of example only, to 
describe this embodiment of the present invention. 

The resulting watermarked audio signal y is then obtained at the output 14 of the
embedding apparatus 10 by adding an appropriately scaled version of the product of $w_c$ and x
to the host signal:

$$y[n] = x[n] + \alpha\, w_c[n]\, x[n] \qquad (1)$$

Preferably, the watermark Wc is chosen such that when multiplied with x, it 
predominantly modifies the short time envelope of x. 
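The embedding rule described above (the host signal plus a gain-scaled product of watermark and host) can be sketched in a few lines of Python. This is only an illustration with a constant gain; the function name and the sample values are assumptions, and no psycho-acoustic adaptation of the gain is attempted.

```python
def embed(x, w_c, alpha):
    """Return y[n] = x[n] + alpha * w_c[n] * x[n] for equal-length x and w_c."""
    if len(x) != len(w_c):
        raise ValueError("host and watermark must have the same length")
    return [xn + alpha * wn * xn for xn, wn in zip(x, w_c)]


# Hypothetical host samples and watermark values, constant gain alpha = 0.1.
host = [0.5, -0.5, 1.0, -1.0]
mark = [1.0, 1.0, -1.0, -1.0]
y = embed(host, mark, alpha=0.1)
```

Because each output sample equals (1 + alpha * w_c[n]) * x[n], a slowly varying watermark scales the local amplitude of the host, which is how the short time envelope ends up carrying the watermark.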

Fig. 2 shows one preferred embodiment in which the input 8 to the multiplier
18 in Fig. 1 is obtained by filtering the replica of the host signal x using a filter H in the
filtering unit 15. If the filter output is denoted by $x_b$, then according to this preferred
embodiment, the watermarked signal is generated by adding an appropriately scaled version
of the product of $x_b$ and the watermark $w_c$ to the host signal x.

Let $\bar{x}_b$ be defined such that $\bar{x}_b = x - x_b$, and $y_b$ be defined such that
$y = y_b + \bar{x}_b$; then the watermarked signal can be written as

$$y[n] = (1 + \alpha\, w_c[n])\, x_b[n] + \bar{x}_b[n] \qquad (2)$$

and the envelope modulated portion $y_b$ of the watermarked signal y is given as

$$y_b[n] = (1 + \alpha\, w_c[n])\, x_b[n] \qquad (3)$$



Preferably, as shown in Fig. 3, the filter H is a linear phase band-pass filter
characterized by its lower cut off frequency $f_L$ and upper cut off frequency $f_H$. As can be seen
in Fig. 3b, the filter H has a linear phase response with respect to frequency f within the pass
band (BW). Thus, when H is a band-pass filter, $x_b$ and $\bar{x}_b$ are the in-band and out-of-band
components of the host signal respectively. For optimum performance, it is preferable that
the signals $\bar{x}_b$ and $x_b$ are in phase. This is achieved by appropriately compensating for the
phase distortion produced by filter H.

In Fig. 4, the details of the payload embedder and watermark conditioning unit
6 are shown. In this unit the watermark seed signal $w_s$ is converted into a multi-bit watermark
signal $w_c$.

Firstly a finite length, preferably zero mean and uniformly distributed random
sequence $w_s$ is generated using a random number generator with an initial seed S. As will be
appreciated later, it is preferable that this initial seed S is known to both the embedder and the
detector, such that a copy of the watermark signal can be generated at the detector for
comparison purposes. This results in the sequence of length $L_w$

$$w_s[k] \in [-1, 1], \quad \text{for } k = 0, 1, 2, \ldots, L_w - 1 \qquad (4)$$

Then the sequence $w_s$ is circularly shifted by the amounts $d_1$ and $d_2$ using the
circularly shifting units 30 to obtain the random sequences $w_{d1}$ and $w_{d2}$. It will be
appreciated that these two sequences ($w_{d1}$ and $w_{d2}$) are effectively a first sequence and a
second sequence, with the second sequence being circularly shifted with respect to the first.
Each sequence $w_{di}$, i = 1,2, is subsequently multiplied with a respective sign bit $r_i$ in the
multiplying unit 40, where $r_i$ = +1 or -1, the respective values of $r_1$ and $r_2$ remaining constant,
and only changing when the payload of the watermark is changed. Each sequence is then
converted into a periodic, slowly varying narrow-band signal $w_i$ of length $L_w T_s$ by the
watermark conditioning circuit 20 shown in Fig. 4. Finally, the slowly varying narrow-band
signals $w_1$ and $w_2$ are added with a relative delay $T_r$ (where $T_r < T_s$) to give the multi-bit
payload watermark signal $w_c$. This is achieved by first delaying the signal $w_2$ by the amount
$T_r$ using delaying unit 45 and subsequently by adding it to $w_1$ with the adding unit 50.
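The first part of this chain (seeded random sequence, circular shifts and sign bits) might look as follows in Python. It is a sketch under assumptions: the function names are hypothetical, Python's `random.Random` stands in for whatever random number generator an implementation would share between embedder and detector, and the shift direction is chosen arbitrarily.

```python
import random


def seed_sequence(seed, length):
    """Uniformly distributed values in [-1, 1], reproducible from the shared seed."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(length)]


def shifted_signed(w_s, d, r):
    """Circularly shift w_s by d positions and multiply by the sign bit r (+1 or -1)."""
    if r not in (1, -1):
        raise ValueError("sign bit must be +1 or -1")
    d %= len(w_s)
    return [r * v for v in (w_s[-d:] + w_s[:-d])]


# Hypothetical payload parameters: shifts d1, d2 and sign bits r1, r2.
w_s = seed_sequence(seed=1234, length=16)
w_d1 = shifted_signed(w_s, d=2, r=1)
w_d2 = shifted_signed(w_s, d=7, r=-1)
```

A detector that knows the same seed can regenerate `w_s` exactly, which is what makes the later correlation against a reference watermark possible.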

Fig. 5 shows in more detail the watermark conditioning apparatus 20 used in 
the payload embedder and watermark conditioning apparatus 6. The watermark random 
sequence is input to the conditioning apparatus 20. 

For convenience, the modification of only one of the sequences $w_{di}$ is shown
in Fig. 5, but it will be appreciated that each of the sequences is modified in a similar manner,
with the results being added to obtain the watermark signal $w_c$.

As shown in Fig. 5, each watermark signal sequence $w_{di}[k]$, i = 1,2, is applied to
the input of a sample repeater 180. Chart 181 illustrates one of the possible sequences $w_{di}$ as a
sequence of values of random numbers between +1 and -1, with the sequence being of length
$L_w$. The sample repeater repeats each value within the watermark random sequence $T_s$ times,
so as to generate a pulse train signal of rectangular shape. $T_s$ is referred to as the watermark
symbol period and represents the span of the watermark symbol in the audio signal. Chart
183 shows the results of the signal illustrated in chart 181 once it has passed through the
sample repeater 180.

A window shaping function s[n], such as a raised cosine window, is then
applied to convert the rectangular pulse signals derived from $w_{d1}$ and $w_{d2}$ into slowly varying
signals $w_1[n]$ and $w_2[n]$ respectively.

Chart 184 shows a typical raised cosine window shaping function, which is
also of period $T_s$.

The generated signals $w_1[n]$ and $w_2[n]$ are then added up with a relative delay
$T_r$ (where $T_r < T_s$) to give the multi-bit payload watermark signal $w_c[n]$, i.e.

$$w_c[n] = w_1[n] + w_2[n - T_r] \qquad (5)$$

The value of $T_r$ is chosen such that the zero crossings of $w_1$ match the
maximum amplitude points of $w_2$ and vice-versa. Thus, for a raised cosine window shaping
function $T_r = T_s/2$, and for a bi-phase window shaping function $T_r = T_s/4$. For other window
shaping functions, other values of $T_r$ are possible.
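The conditioning steps (sample repetition, window shaping and delayed addition) can be sketched as follows. The raised cosine is written here as a Hann-type window peaking in the middle of the symbol, and the delayed signal is indexed modulo its length because the conditioned signals are described as periodic; both choices, along with all names and values, are assumptions made for this illustration.

```python
import math


def raised_cosine(ts):
    """Raised cosine window of ts samples: zero at the symbol edge, one mid-symbol."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * n / ts)) for n in range(ts)]


def condition(symbols, window):
    """Repeat every symbol len(window) times and shape each repetition with the window."""
    return [v * s for v in symbols for s in window]


def combine(w1, w2, tr):
    """w_c[n] = w1[n] + w2[n - tr], with w2 treated as periodic."""
    total = len(w1)
    return [w1[n] + w2[(n - tr) % total] for n in range(total)]


ts = 8                                   # hypothetical symbol period in samples
window = raised_cosine(ts)
w1 = condition([1.0, -1.0, 1.0], window)
w2 = condition([-1.0, 1.0, 1.0], window)
w_c = combine(w1, w2, tr=ts // 2)        # raised cosine case: T_r = T_s / 2
```

With this delay, every sample where `w1` reaches its maximum coincides with a zero of the delayed `w2`, so the two shaped sequences interleave rather than mask each other.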

As will be appreciated from the description below, during detection the
watermarked signal carrying $w_c[n]$ will generate two correlation peaks that are separated by
pL (as can be seen in Fig. 11). The value pL is part of the payload, and is defined as

$$pL = |d_2 - d_1| \bmod \left\lceil \frac{L_w}{2} \right\rceil \qquad (6)$$

In addition to pL, extra information can be encoded by changing the relative
signs of the embedded watermarks.

In the detector, this is seen as a relative sign $r_{sign}$ between the correlation
peaks. It will be seen that $r_{sign}$ can take four possible values, and may be defined as:

$$r_{sign} = \frac{2\rho_1 + \rho_2 + 3}{2} \in \{0, 1, 2, 3\} \qquad (7)$$

where $\rho_1 = \mathrm{sign}(cL_1)$ and $\rho_2 = \mathrm{sign}(cL_2)$ are respectively estimates of the sign bits $r_1$ (input 80)
and $r_2$ (input 90) of Fig. 4, and $cL_1$ and $cL_2$ are the values of the correlation peaks
corresponding to $w_{d1}$ and $w_{d2}$ respectively. The overall watermark payload $pL_w$ is then given
as a combination of $r_{sign}$ and pL:

$$pL_w = \langle r_{sign},\, pL \rangle \qquad (8)$$

The maximum information ($I_{max}$), in number of bits, that can be carried by a
watermark sequence of length $L_w$ is thus given by:

$$I_{max} = \log_2\left(4 \left\lceil \frac{L_w}{2} \right\rceil\right) \text{ bits} \qquad (9)$$
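The payload arithmetic can be made concrete with a short Python sketch. It assumes that the peak separation is the absolute difference of the two shifts taken modulo ceil(L_w / 2), that the relative sign is (2 * rho1 + rho2 + 3) / 2 with rho1, rho2 in {-1, +1}, and that the capacity is log2(4 * ceil(L_w / 2)) bits; the function names and example values are hypothetical.

```python
import math


def payload_shift(d1, d2, lw):
    """Peak separation pL = |d2 - d1| mod ceil(lw / 2)."""
    return abs(d2 - d1) % math.ceil(lw / 2)


def relative_sign(rho1, rho2):
    """Map the two sign estimates (+1 or -1) to one of four values 0..3."""
    return (2 * rho1 + rho2 + 3) // 2


def max_information_bits(lw):
    """Capacity in bits: four sign combinations times ceil(lw / 2) shift values."""
    return math.log2(4 * math.ceil(lw / 2))


# Hypothetical example: a 512-symbol watermark with shifts 3 and 10.
pl = payload_shift(3, 10, 512)
signs = [relative_sign(a, b) for a in (-1, 1) for b in (-1, 1)]
bits = max_information_bits(512)
```

For a 512-symbol sequence this gives 256 distinguishable shift differences and four sign combinations, i.e. 1024 payload values or 10 bits.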

In such a scheme, the payload is immune to relative offset between the
embedder and the detector, and also to possible time scale modifications. The window
shaping function has been identified as one of the main parameters that controls the
robustness and audibility behavior of the present watermarking scheme. As illustrated in
Figs. 6a and 6b, two examples of possible window shaping functions are herein described: a
raised cosine function and a bi-phase function.

It is preferable to use a bi-phase window function instead of a raised cosine
window function, so as to obtain a quasi DC-free watermark signal. This is illustrated in
Figs. 7a and 7b, which show the frequency spectra corresponding to a watermark sequence
(in this case a sequence of $w_{di}[k]$ = {1,1,-1,1,-1,-1}) conditioned with respectively a raised
cosine and a bi-phase window shaping function. As can be seen, the frequency spectrum for
the raised cosine conditioned watermark sequence has a maximum at frequency f = 0, whilst
the frequency spectrum for the bi-phase shaped watermark sequence has a minimum at f = 0,
i.e. it has very little DC component.
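The difference in DC content can be checked numerically. In the sketch below the bi-phase window is assumed to be a positive raised-cosine lobe followed by an identical negative lobe; the exact shape of Fig. 6b is not reproduced here, and the symbol values are hypothetical ones chosen to have a non-zero average.

```python
import math


def raised_cosine(ts):
    """Raised cosine window of ts samples."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * n / ts)) for n in range(ts)]


def bi_phase(ts):
    """Assumed bi-phase window: positive lobe in the first half, negative in the second."""
    lobe = raised_cosine(ts // 2)
    return lobe + [-v for v in lobe]


def mean_level(symbols, window):
    """Average (DC) value of the symbol sequence after window shaping."""
    shaped = [v * s for v in symbols for s in window]
    return sum(shaped) / len(shaped)


symbols = [1.0, 1.0, -1.0, 1.0]   # hypothetical symbols with a non-zero average
ts = 16
dc_raised = mean_level(symbols, raised_cosine(ts))
dc_bi_phase = mean_level(symbols, bi_phase(ts))
```

Since the assumed bi-phase window sums to zero over one symbol, the shaped signal has zero mean whatever the symbol values are, whereas the raised cosine passes the average of the symbols straight through.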

Useful information is only contained in the non-DC component of the
watermark. Consequently, for the same added watermark energy, a watermark conditioned
with the bi-phase window will carry more useful information than one conditioned by the
raised cosine window. As a result, the bi-phase window offers superior audibility quality for
the same robustness or, conversely, it allows a better robustness for the same audibility
quality.

Such a bi-phase function could also be utilized as a window shaping function
for other watermarking schemes. In other words, a bi-phase function could be applied to
reduce the DC component of signals (such as a watermark) that are to be incorporated into
another signal.

Fig. 8 shows a block diagram of a watermark detector (200, 300, 400). The
detector consists of three major stages: (a) the watermark symbol extraction stage (200), (b)
the buffering and interpolation stage (300), and (c) the correlation and decision stage (400).

In the symbol extraction stage (200), the received watermarked signal y'[n] is
processed to generate multiple ($N_b$) estimates of the watermarked sequence, which are
multiplexed into the signal $w_e[m]$. These estimates of the watermark sequence are required to
resolve any time offset that may exist between the embedder and the detector, so that the
watermark detector can synchronize to the watermark sequence inserted in the host signal.

In the buffering and interpolation stage (300), these estimates are de-multiplexed
into $N_b$ separate buffers. An interpolation is subsequently applied to each buffer
to resolve possible timescale modifications that may have occurred. For instance, a drift in
sampling (clock) frequency may result in a stretch or shrink in the time domain signal (i.e.
the watermark may have been stretched or shrunk).

In the correlation and decision stage (400), the content of each buffer is
correlated with the reference watermark and the maximum correlation peaks are compared
against a threshold to determine the likelihood of whether the watermark is indeed embedded
within the received signal y'[n].
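The correlation and decision step can be illustrated with a direct circular correlation in Python. This is a simplified sketch: it looks for a single peak rather than the two peaks produced by the two shifted sequences, and the reference sequence, the threshold and the function names are hypothetical.

```python
def circular_correlation(estimate, reference):
    """c[d] = sum over k of estimate[k] * reference[(k - d) mod L], for every shift d."""
    length = len(reference)
    return [
        sum(estimate[k] * reference[(k - d) % length] for k in range(length))
        for d in range(length)
    ]


def detect(estimate, reference, threshold):
    """Return (watermark present?, shift of the largest-magnitude correlation peak)."""
    corr = circular_correlation(estimate, reference)
    peak_shift = max(range(len(corr)), key=lambda d: abs(corr[d]))
    return abs(corr[peak_shift]) >= threshold, peak_shift


# Hypothetical reference (a length-7 binary sequence) and a noise-free estimate
# that equals the reference circularly shifted by 3 positions.
reference = [1, 1, 1, -1, 1, -1, -1]
estimate = reference[-3:] + reference[:-3]
detected, shift = detect(estimate, reference, threshold=5)
```

In the scheme described here the same correlation would show two peaks, one per shifted sequence, and their separation and signs would then be read as the payload.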

In order to maximize the accuracy of the watermark detection, the watermark
detection process is typically carried out over a length of received signal that is 3 to 4
times that of the watermark sequence length. Thus each watermark symbol to be detected can
be constructed by taking the average of several symbols. This averaging process is referred
to as smoothing, and the number of times the averaging is done is referred to as the
smoothing factor $s_f$. Thus, the detection window length $L_D$ is the length of the audio segment
(in number of samples) over which a watermark detection truth-value is reported.
Consequently, $L_D = s_f L_w T_s$, where $T_s$ is the symbol period and $L_w$ the number of symbols
within the watermark sequence. Typically, the length ($L_b$) of each buffer 320 within the
buffering and interpolation stage is $L_b = s_f L_w$.

In the watermark symbol extraction stage 200 shown in Fig. 8, the incoming
watermarked signal y'[n] is input to the signal conditioning filter $H_b$ (210). This filter 210 is
typically a band pass filter and has the same behavior as the corresponding filter H (15)
shown in Fig. 2. The output of the filter $H_b$ is $y'_b[n]$, and assuming linearity within the
transmission channel, it follows from equations (2) and (3):

$$y'_b[n] \approx y_b[n] = (1 + \alpha\, w_c[n])\, x_b[n] \qquad (10)$$

Note that when no filter is used in the embedder (i.e., when H = 1) then $H_b$ in
the detector can also be omitted, or it can still be included to improve the detection
performance. If $H_b$ is omitted, then $y_b$ in equation (10) is replaced with y. The rest of the
processing is the same.



For simplification, it is assumed that there is perfect synchronism between the
embedder and the detector (i.e. no offset and no change in timescale), and that the audio
signal is divided into frames of length $T_s$, and that $y'_{b,m}[n]$ is the n-th sample of the m-th
frame of the filtered signal $y'_b[n]$. It should be noted that if there is not perfect synchronism
between the embedder and the detector, then any deviation can be compensated for within the
buffering and interpolation stage 300 utilizing techniques known to the skilled person, e.g.
iteratively searching through all possible scale and offset modifications until a best match is
achieved.

The energy E[m] corresponding to the $y'_{b,m}[n]$ frame is:

$$E[m] = \sum_{n=0}^{T_s - 1} \left| y'_{b,m}[n] \right|^2 S[n] \qquad (11)$$

where S[n] is the same window shaping function used in the watermark conditioning circuit
of Fig. 5. A person skilled in the art will appreciate that equation 11 represents a matched
filter receiver, and is the optimum receiver when the symbol period is perfectly synchronized.
Notwithstanding this fact, from now on, we set S[n] = 1 in order to simplify subsequent
explanations.
Combining this with equation 10, it follows that:

$$E[m] = \sum_{n=0}^{T_s - 1} \left| (1 + \alpha\, w_e[m])\, x_{b,m}[n] \right|^2 \qquad (12)$$



where $w_e[m]$ is the m-th extracted watermark symbol and contains $N_b$ time-multiplexed
estimates of the embedded watermark sequences. Solving for $w_e[m]$ in equation 12 and
ignoring higher order terms of α gives the following approximation:

$$w_e[m] \approx \frac{1}{2\alpha} \left( \frac{E[m]}{\sum_{n=0}^{T_s - 1} \left| x_{b,m}[n] \right|^2} - 1 \right) \qquad (13)$$



In the watermark extraction stage 200 shown in Fig. 8, the output $y'_b[n]$ of the
filter $H_b$ is provided as an input to a frame divider 220, which divides the audio signal into
frames of length $T_s$, i.e. into $y'_{b,m}[n]$, with the energy calculating unit 230 then being used to
calculate the energy corresponding to each of the framed signals as per equation (11). The
output of this energy calculation unit 230 is then provided as an input to the whitening stage
$H_w$ (240), which performs the function shown in equation 13 so as to provide an output
$w_e[m]$. Alternative implementations (240A, 240B) of this whitening stage are illustrated in
Figs. 9 and 10.
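The frame divider and energy calculation can be sketched in Python as below, using the simplification S[n] = 1 adopted in the text. The function name and the handling of a trailing incomplete frame (it is dropped) are assumptions made for this illustration.

```python
def frame_energies(y_b, ts):
    """Split y_b into consecutive frames of ts samples and return the energy of each.

    Samples left over after the last complete frame are ignored.
    """
    n_frames = len(y_b) // ts
    return [
        sum(v * v for v in y_b[m * ts:(m + 1) * ts])
        for m in range(n_frames)
    ]


# Hypothetical filtered samples, framed with a symbol period of 2 samples.
energies = frame_energies([1, 2, 3, 4, 5, 6], ts=2)
```

Each returned value plays the role of E[m] for one watermark symbol period, which is the input the whitening stage works on.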

It will be realized that the denominator of equation 13 contains a term that
requires knowledge of the host (original) signal x. As the signal x is not available to the
detector, it means that in order to calculate $w_e[m]$ the denominator of equation 13 must
be estimated.

Below is described how such an estimation can be achieved for the two
described window shaping functions (the raised cosine window shaping function and the
bi-phase window shaping function), but it will equally be appreciated that the teaching could be
extended to other window shaping functions.

In relation to the raised cosine window shaping function shown in Fig. 6a, it
has been realized that the audio envelope induced by the watermark contributes
predominantly to the noisy part of the energy function E[m]. The slowly varying part (i.e. the
low frequency components) is predominantly due to the contribution of the envelope of the
original audio signal x. Thus, equation 13 may be approximated by:

$$w_e[m] \approx \frac{1}{2\alpha} \left( \frac{E[m]}{\mathrm{lowpass}(E[m])} - 1 \right) \qquad (14)$$
where "lowpass(.)" is a low pass filter function. Thus, it will be appreciated that the
whitening filter $H_w$ for the raised cosine window shaping function can be realized as
shown in Fig. 9.

As can be seen, such a whitening filter $H_w$ (240A) comprises an input 242A
for receiving the signal E[m]. A portion of this signal is then passed through the low pass
filter 247A to produce a low pass filtered energy signal $E_{LP}[m]$, which in turn is provided as
an input to the calculation stage 248A along with the function E[m]. The calculation stage
248A then divides E[m] by $E_{LP}[m]$ to calculate the extracted watermark symbol $w_e[m]$.
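A minimal Python sketch of this whitening step is given below. It assumes the symbol estimate is the ratio of the frame energy to its low-pass filtered version, minus one, scaled by 1/(2α), and it uses a centered moving average as a stand-in for the low pass filter 247A; the filter choice, the span and the function names are assumptions, not details from the specification.

```python
def moving_average(values, span):
    """Centered moving average, shortened at the edges; a stand-in low pass filter."""
    half = span // 2
    smoothed = []
    for m in range(len(values)):
        lo = max(0, m - half)
        hi = min(len(values), m + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def whiten_raised_cosine(energy, alpha, span=5):
    """Estimate w_e[m] as (E[m] / lowpass(E[m]) - 1) / (2 * alpha).

    Assumes every smoothed energy value is strictly positive.
    """
    smooth = moving_average(energy, span)
    return [(e / s - 1.0) / (2.0 * alpha) for e, s in zip(energy, smooth)]


# A constant energy track carries no modulation, so every estimate is zero.
flat = whiten_raised_cosine([4.0] * 6, alpha=0.1)
```

Fluctuations of E[m] around its smoothed trend are what survive the division, which matches the observation that the watermark sits in the noisy part of the energy function.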

When a bi-phase window function is employed in the watermark conditioning
stage of the embedder, a different approach should be utilized to estimate the envelope of the
original audio, and hence to calculate $w_e[m]$.

It will be seen by examination of the bi-phase window function shown in Fig.
6b, that when the envelope of an audio frame is modulated with such a window function, the
first and the second halves of the frame are scaled in opposite directions. In the detector, this
property is utilized to estimate the envelope energy of the host signal x.

Consequently, within the detector, the audio frame is first sub-divided into two 
halves. The energy functions corresponding to the first and second half frames are hence 
given by 

$$E_1[m] = \sum_{n} \left| y_{1,m}[n] \right|^2 \qquad (15)$$

and 

$$E_2[m] = \sum_{n} \left| y_{2,m}[n] \right|^2 \qquad (16)$$


respectively. As the envelope of the original audio is modulated in opposite directions within 
the two sub-frames, the original audio envelope can be approximated as the mean of E1[m] 
and E2[m]. 
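
As a purely illustrative sketch, the two half-frame energy functions might be computed in Python as below. The function name `half_frame_energies`, the even frame length and the discarding of a trailing partial frame are assumptions of this example and are not taken from the specification.

```python
import numpy as np

def half_frame_energies(y, frame_len):
    """Return (E1, E2): per-frame energies of the first and second half frames."""
    if frame_len % 2:
        raise ValueError("frame_len is assumed to be even")
    y = np.asarray(y, dtype=float)
    n_frames = len(y) // frame_len
    # Drop any trailing partial frame, then view each frame as two halves.
    frames = y[: n_frames * frame_len].reshape(n_frames, 2, frame_len // 2)
    energies = np.sum(frames ** 2, axis=2)
    return energies[:, 0], energies[:, 1]

E1, E2 = half_frame_energies([1, 1, 2, 2, 3, 3, 0, 0], frame_len=4)
# The mean of the two halves approximates the host envelope energy.
envelope = 0.5 * (E1 + E2)
```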

Further, the instantaneous modulation value can be taken as the difference 
between these two functions. Thus, for the bi-phase window function, the watermark We[m] 
can be approximated by: 

$$W_e[m] \approx \frac{E_1[m] - E_2[m]}{2\alpha\left(E_1[m] + E_2[m]\right)} \qquad (17)$$

Consequently, the whitening filter Hw 240B for a bi-phase window shaping 
function can be realized as shown in Fig. 10. Inputs 242B and 243B respectively receive the 
energy functions of the first and second half frames E1[m] and E2[m]. Each energy function 
is then split up into two, and provided to adders 245B and 246B which respectively calculate 
E1[m] - E2[m] and E1[m] + E2[m]. Both of these calculated functions are then passed to the 
calculating unit 248B, which divides the value from adder 245B by the value from adder 246B 
so as to calculate the watermark We[m], in accordance with equation 17. 
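
The arrangement of Fig. 10 could, for example, be sketched in Python as follows. The parameter `alpha` stands for a constant scale factor in the denominator and is an assumption of this sketch, as is the function name `whiten_biphase`; frames with zero total energy are not handled.

```python
import numpy as np

def whiten_biphase(E1, E2, alpha=1.0):
    """Estimate watermark symbols from the two half-frame energy functions.

    ASSUMPTION: `alpha` is a constant scale factor; with alpha = 1 the
    result is simply (E1 - E2) / (2 * (E1 + E2)).
    """
    E1 = np.asarray(E1, dtype=float)
    E2 = np.asarray(E2, dtype=float)
    difference = E1 - E2   # corresponds to adder 245B
    total = E1 + E2        # corresponds to adder 246B
    # Corresponds to calculating unit 248B: difference divided by sum.
    return difference / (2.0 * alpha * total)

# Synthetic check: the two halves are scaled in opposite directions.
w = np.array([1.0, -1.0, -1.0, 1.0])
host_energy, a = 4.0, 0.1
E1 = host_energy * (1.0 + a * w) ** 2
E2 = host_energy * (1.0 - a * w) ** 2
w_est = whiten_biphase(E1, E2, alpha=a)
```

In this synthetic case the estimate equals w scaled by 1/(1 + a²), which shows that the approximation is close for a weak modulation.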

This output We[m] is then passed to the buffering and interpolation stage 300, 
where the signal is de-multiplexed by a de-multiplexer 310, buffered in buffers 320 of length 
Li so as to resolve any lack of synchronism between the embedder and the detector, and 
interpolated within the interpolation unit 330 so as to compensate for any time scale 
modification between the embedder and the detector. Such compensation can utilize known 
techniques, and hence is not described in any more detail within this specification. 

As shown in Fig. 8, outputs (wD1, wD2, ..., wDNb) from the buffering stage are 
passed to the interpolation stage and, after interpolation, the outputs (wI1, wI2, ..., wINb) of 
this stage, which correspond to the different estimates of the correctly re-scaled signal, are 
passed to the correlation and decision stage. If it is believed that no time scaling compensation 
is required, the values (wD1, wD2, ..., wDNb) can be passed directly to the correlation and 
decision stage 400, i.e. the interpolation stage 330 can be omitted from the apparatus. 
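
For illustration, a de-multiplexer feeding several buffers might be sketched in Python as below. The round-robin assignment of successive values to the buffers is an assumption of this example (the specification does not state the ordering), and the name `demultiplex` is hypothetical.

```python
def demultiplex(stream, n_buffers):
    """Distribute successive values round-robin over n_buffers buffers.

    ASSUMPTION: value number i goes to buffer i modulo n_buffers.
    """
    values = list(stream)
    return [values[j::n_buffers] for j in range(n_buffers)]

buffers = demultiplex([0, 1, 2, 3, 4, 5, 6], n_buffers=3)
```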

The correlator 410 calculates the correlation of each estimate wIj, j = 1, ..., Nb, 
with respect to the reference watermark sequence Ws[k]. Each respective correlation output 
corresponding to each estimate is then applied to the maximum detection unit 420, which 
determines which two estimates provided the best fits for the circularly shifted versions wd1 
and wd2 of the reference watermark. The correlation values (the peak amplitudes and 
positions) for these estimate sequences are passed to the threshold detector and payload 
extractor unit 430. 

If the interpolation stage is omitted, the correlator 410 instead calculates 
the correlation of each estimate wDj, j = 1, ..., Nb, with the reference watermark sequence 
Ws[k], and the results are passed on for subsequent processing to the units 420 and 430 as 
outlined in the above paragraph. 

The threshold detector and payload extractor unit 430 may be utilized to 
extract the payload (e.g. information content) from the detected watermark signal. Once the 
unit has estimated the two correlation peaks cL1 and cL2 that exceed the detection threshold, 
the distance pL between the peaks (as defined by equation (6)) is measured. Next, the signs 
p1 and p2 of the correlation peaks are determined, and hence rsign is calculated from equation 
(7). The overall watermark payload may then be calculated using equation (8). 
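
By way of example only, the correlation of an estimate with circularly shifted versions of a reference sequence might be sketched in Python as follows. The use of the FFT to evaluate the circular correlation, and the names `circular_correlation` and `two_largest_peaks`, are choices made for this sketch and are not taken from the specification.

```python
import numpy as np

def circular_correlation(estimate, reference):
    """Circular cross-correlation of two equal-length real sequences."""
    a = np.asarray(estimate, dtype=float)
    b = np.asarray(reference, dtype=float)
    if a.shape != b.shape:
        raise ValueError("sequences are assumed to have equal length")
    # Entry d equals sum over k of a[(k + d) mod N] * b[k].
    spectrum = np.fft.fft(a) * np.conj(np.fft.fft(b))
    return np.real(np.fft.ifft(spectrum))

def two_largest_peaks(corr):
    """Return (position, value) of the two largest peaks by magnitude."""
    order = np.argsort(np.abs(corr))[::-1]
    return [(int(i), float(corr[i])) for i in order[:2]]

# Synthetic estimate: one positive and one negative shifted copy.
rng = np.random.default_rng(1)
reference = rng.choice([-1.0, 1.0], size=1024)
estimate = np.roll(reference, 10) - np.roll(reference, 40)
peaks = two_largest_peaks(circular_correlation(estimate, reference))
```

Peaks are ranked by absolute value so that a negatively signed peak is found in the same way as a positive one.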

For instance, it can be seen in Fig. 11 that pL is the relative distance between 
the two peaks. Both peaks are positive, i.e. p1 = +1 and p2 = +1. From equation (7), rsign = 3. 

Consequently, the payload pLw = <3, pL>. 

The reference watermark sequence Ws used within the detector corresponds to 
(a possibly circularly shifted version of) the original watermark sequence applied to the host 
signal. For instance, if the watermark signal was calculated using a random number generator 
with seed S within the embedder, then equally the detector can calculate the same random 
number sequence using the same random number generation algorithm and the same initial 
seed so as to determine the watermark signal. Alternatively, the watermark signal originally 
applied in the embedder and utilized by the detector as a reference could simply be any 
predetermined sequence. 
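
The regeneration of the reference sequence from a shared seed might, for illustration, look as follows in Python. The NumPy generator and the normally distributed values are assumptions of this sketch; the specification only requires that embedder and detector use the same algorithm and the same initial seed.

```python
import numpy as np

def reference_watermark(seed, length):
    """Regenerate a pseudo-random reference sequence from a shared seed.

    ASSUMPTION: values are drawn from a standard normal distribution.
    """
    rng = np.random.default_rng(seed)
    return rng.standard_normal(length)

embedder_sequence = reference_watermark(seed=1234, length=512)
detector_sequence = reference_watermark(seed=1234, length=512)
# A circularly shifted version of the reference, here shifted by 7 positions.
shifted = np.roll(detector_sequence, 7)
```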

Fig. 11 shows a typical shape of a correlation function as output from the 
correlator 410. The horizontal scale shows the correlation delay (in terms of the sequence 
bins). The vertical scale on the left hand side (referred to as the confidence level cL) 
represents the value of the correlation peak normalized with respect to the standard deviation 
of the (typically normally distributed) correlation function. 

As can be seen, the typical correlation is relatively flat with respect to cL, and 
centered about cL = 0. However, the function contains two peaks, which are separated by pL 
(see equation 6) and extend upwards to cL values that are above the detection threshold when 
a watermark is present. When the correlation peaks are negative, the above statement applies 
to their absolute values. 

A horizontal line (shown in the figure as being set at cL = 8.7) represents the 
detection threshold. The detection threshold value controls the false alarm rate. 
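
As an illustrative sketch only, the normalization to confidence levels and the comparison with the detection threshold might be written in Python as below. Estimating the standard deviation from the whole correlation function, peaks included, is a simplifying assumption of this example, and the function names are hypothetical.

```python
import numpy as np

def confidence_levels(corr):
    """Normalize a correlation function by its standard deviation."""
    corr = np.asarray(corr, dtype=float)
    # ASSUMPTION: the standard deviation is taken over all delays,
    # including the peaks, which slightly lowers the reported levels.
    return corr / np.std(corr)

def peaks_above_threshold(corr, threshold=8.7):
    """Return (delay, cL) for every delay whose |cL| exceeds the threshold."""
    cl = confidence_levels(corr)
    indices = np.flatnonzero(np.abs(cl) > threshold)
    return [(int(i), float(cl[i])) for i in indices]

# Synthetic correlation function: flat noise with two inserted peaks.
rng = np.random.default_rng(2)
corr = rng.standard_normal(1024)
corr[100] = 30.0
corr[300] = -25.0
detected = peaks_above_threshold(corr)
```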

Two kinds of false alarm exist: the false positive, whose rate is defined as the 
probability of detecting a watermark in non-watermarked items, and the false negative, 
whose rate is defined as the probability of not detecting a watermark in watermarked items. 
Generally, the requirement on the false positive rate is more stringent than that on the false 
negative rate. The right hand side scale on Fig. 11 illustrates the probability of a false positive 
alarm p. As can be seen, in the example shown, each value of p is equivalent to a threshold 
value: one value of p is equivalent to the threshold cL = 8.7, whilst a smaller value of p is 
equivalent to cL = 20. 
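
To illustrate how a threshold relates to a false positive probability, the following Python sketch evaluates the one-sided tail of a unit-variance normal distribution. It assumes a single, normally distributed correlation value; the right hand scale of Fig. 11 may take further factors into account (for example the number of delays examined or the requirement of two peaks), so the values produced here need not coincide with that scale.

```python
import math

def false_positive_probability(threshold):
    """One-sided tail probability of a standard normal variable.

    ASSUMPTION: a single correlation value, normally distributed with
    zero mean and unit variance when no watermark is present.
    """
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))

p_low = false_positive_probability(8.7)
p_high = false_positive_probability(20.0)
```

Raising the threshold lowers the false positive probability very rapidly, at the cost of a higher false negative rate.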

After each detection interval, the detector determines whether or not the original 
watermark is present, and on this basis outputs a "yes" or a "no" decision. If desired, to 
improve this decision making process, a number of detection windows 
may be considered. In such an instance, the false positive probability is a combination of the 
individual probabilities for each detection window considered, dependent upon the desired 
criteria. For instance, it could be determined that if the correlation function has two peaks 
above a threshold of cL = 7 on any two out of three detection intervals, then the watermark is 
deemed to be present. Obviously, such detection criteria can be altered depending upon the 
desired use of the watermark signal and to take into account factors such as the original 
quality of the host signal and how badly the signal is likely to be corrupted during normal 
transmission. 
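
A two-out-of-three criterion of this kind might be sketched in Python as follows. The combined false positive probability is computed under the assumption that the detection intervals are statistically independent and share the same individual probability p, which the specification does not state; the function names are illustrative.

```python
from math import comb

def watermark_present(interval_hits, required=2):
    """Decide "yes" if at least `required` intervals reported a detection."""
    return sum(bool(hit) for hit in interval_hits) >= required

def combined_false_positive(p, n=3, k=2):
    """Probability that at least k of n intervals give a false positive.

    ASSUMPTION: intervals are independent, each with probability p.
    """
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

decision = watermark_present([True, False, True])
combined = combined_false_positive(0.1)
```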

It will be appreciated by the skilled person that various implementations not 
specifically described would be understood as falling within the scope of the present 
invention. For instance, whilst only the functionality of the embedding and detecting 
apparatus has been described, it will be appreciated that the apparatus could be realized as a 
digital circuit, an analog circuit, a computer program, or a combination thereof. 

Equally, whilst the above embodiment has been described with reference to an 
audio signal, it will be appreciated that the present invention can be applied to other types of 
signal, for instance video and data signals. 

Within the specification it will be appreciated that the word "comprising" does 
not exclude other elements or steps, that "a" or "an" does not exclude a plurality, and that a 
single processor or other unit may fulfil the functions of several means recited in the claims. 